Universal Morphology for Old Hungarian

Authors

  • Eszter Simon
  • Veronika Vincze
Abstract

This paper describes the automatic conversion of the morphologically annotated part of the Old Hungarian Corpus. These texts are in the format of the Humor analyzer, which does not follow any international standard. Since standardization facilitates future research, even for researchers who do not know Old Hungarian, we opted to map the Humor formalism to a widely used universal tagset, namely that of the Universal Dependencies framework. A tagset shared across languages enables interlingual comparisons from a theoretical point of view, and multilingual NLP applications can also profit from a unified annotation scheme. In this paper, we report on the adaptation of the Universal Dependencies morphological annotation scheme to Old Hungarian, and we discuss the most important theoretical linguistic issues that had to be resolved during the process. We focus on the linguistic phenomena typical of Old Hungarian that required special treatment, and we offer solutions to them.
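As a rough illustration of what mapping one tag formalism onto Universal Dependencies can look like, the sketch below converts a dot-separated source tag into a UPOS tag and a CoNLL-U style feature string. The source labels (`N`, `Pl`, `Acc`, and so on), the dot-separated tag format, and the function name `convert` are hypothetical placeholders chosen for this example; they are not the actual Humor tags, and this is not the authors' conversion procedure. Only the target side (UPOS values such as `NOUN`, features such as `Case=Acc`) follows the Universal Dependencies inventory.

```python
# Hypothetical source labels: these are NOT actual Humor analyzer tags.
POS_MAP = {"N": "NOUN", "V": "VERB", "Adj": "ADJ"}

# Hypothetical source feature labels mapped to UD (feature, value) pairs.
FEATURE_MAP = {
    "Pl": ("Number", "Plur"),
    "Acc": ("Case", "Acc"),
    "Past": ("Tense", "Past"),
}


def convert(tag):
    """Convert a dot-separated source tag into a (UPOS, FEATS) pair.

    Unknown parts of speech fall back to the UD tag "X"; unknown
    feature labels are skipped; an empty feature set is written as "_",
    following the CoNLL-U convention.
    """
    pos, *labels = tag.split(".")
    upos = POS_MAP.get(pos, "X")
    feats = dict(FEATURE_MAP[label] for label in labels if label in FEATURE_MAP)
    # Features are written in alphabetical order of their names.
    feats_str = "|".join(f"{name}={value}" for name, value in sorted(feats.items()))
    return upos, feats_str or "_"


if __name__ == "__main__":
    print(convert("N.Pl.Acc"))
```

A table-driven design like this keeps the linguistic decisions in data rather than in control flow, which may make it easier to revise individual mappings. Phenomena that need context-sensitive treatment would require more than a lookup table.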


Similar resources

On Hungarian Morphology

The aim of this study is to provide an autosegmental description of Hungarian morphology. Chapter 1 sketches the (meta)theoretical background and summarizes the main argument. In Chapter 2 phonological prerequisites to morphological analysis are discussed. Special attention is paid to Hungarian vowel harmony. In Chapter 3 a universal theory of lexical categories is proposed, and the category sy...


Universal Dependencies and Morphology for Hungarian - and on the Price of Universality

In this paper, we present how the principles of universal dependencies and morphology have been adapted to Hungarian. We report the most challenging grammatical phenomena and our solutions to those. On the basis of the adapted guidelines, we have converted and manually corrected 1,800 sentences from the Szeged Treebank to universal dependency format. We also introduce experiments on this manual...


Morphological annotation of Old and Middle Hungarian corpora

In our paper, we present a computational morphology for Old and Middle Hungarian used in two research projects that aim at creating morphologically annotated corpora of Old and Middle Hungarian. In addition, we present the web-based disambiguation tool used in the semi-automatic disambiguation of the annotations and the structured corpus query tool that has a unique but very useful feature of m...


A New Integrated Open-source Morphological Analyzer for Hungarian

The goal of a Hungarian research project has been to create an integrated Hungarian natural language processing framework. This infrastructure includes tools for analyzing Hungarian texts, integrated into a standardized environment. The morphological analyzer is one of the core components of the framework. The goal of this paper is to describe a fast and customizable morphological analyzer and ...


Brill’s Pos Tagger with Extended Lexical Templates for Hungarian

In this paper Brill’s rule-based PoS tagger is tested and adapted to Hungarian. It is shown that the present system does not obtain as high accuracy for Hungarian as it does for English because of the structural difference between these languages. Hungarian has rich morphology, is agglutinative with inflectional characteristics and has free word order. The tagger has the greatest difficulties w...



Publication date: 2016